AITopics

Country: North America > United States (0.94)

Genre: Research Report (0.69)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.54)

Technology:

Information Technology > Biomedical Informatics (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis (0.46)

Bergmeister, Andreas, Lal, Manish Krishan, Jegelka, Stefanie, Sra, Suvrit

A projection-based framework for gradient-free and parallel learning

arXiv.org Artificial IntelligenceJun-9-2025

We present a feasibility-seeking approach to neural network training. This mathematical optimization framework is distinct from conventional gradient-based loss minimization and uses projection operators and iterative projection algorithms. We reformulate training as a large-scale feasibility problem: finding network parameters and states that satisfy local constraints derived from its elementary operations. Training then involves projecting onto these constraints, a local operation that can be parallelized across the network. We introduce PJAX, a JAX-based software framework that enables this paradigm. PJAX composes projection operators for elementary operations, automatically deriving the solution operators for the feasibility problems (akin to autodiff for derivatives). It inherently supports GPU/TPU acceleration, provides a familiar NumPy-like API, and is extensible. We train diverse architectures (MLPs, CNNs, RNNs) on standard benchmarks using PJAX, demonstrating its functionality and generality. Our results show that this approach is as a compelling alternative to gradient-based training, with clear advantages in parallelism and the ability to handle non-differentiable operations.

evolutionary algorithm, machine learning, projection, (20 more...)

2506.05878

Country: Europe (0.93)

Genre: Research Report > New Finding (0.86)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (1.00)

Robert Nishihara, Stefanie Jegelka, Michael I. Jordan

On the Convergence Rate of Decomposable Submodular Function Minimization

Neural Information Processing SystemsFeb-9-2025, 15:00:18 GMT

Submodular functions describe a variety of discrete problems in machine learning, signal processing, and computer vision. However, minimizing submodular functions poses a number of algorithmic challenges. Recent work introduced an easy-to-use, parallelizable algorithm for minimizing submodular functions that decompose as the sum of "simple" submodular functions. Empirically, this algorithm performs extremely well, but no theoretical analysis was given. In this paper, we show that the algorithm converges linearly, and we provide upper and lower bounds on the rate of convergence. Our proof relies on the geometry of submodular polyhedra and draws on results from spectral graph theory.

algorithm, artificial intelligence, optimization problem, (16 more...)

Country:

North America > United States > California > Alameda County > Berkeley (0.14)
Asia > Middle East > Jordan (0.04)
North America > United States > Illinois (0.04)
(3 more...)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.68)

Xianghang Liu, Justin Domke

Projecting Markov Random Field Parameters for Fast Mixing

Neural Information Processing SystemsFeb-8-2025, 21:01:24 GMT

The flaw in practice is that it can take a large and/or unknown amount of time to converge to the stationary distribution. This paper gives sufficient conditions to guarantee that univariate Gibbs sampling on Markov Random Fields (MRFs) will be fast mixing, in a precise sense. Further, an algorithm is given to project onto this set of fast-mixing parameters in the Euclidean norm. Following recent work, we give an example use of this to project in various divergence measures, comparingunivariatemarginals obtained by sampling after projection to common variational methodsandGibbs sampling on the original parameters.

algorithm, artificial intelligence, machine learning, (17 more...)

Country: Oceania > Australia > New South Wales (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.71)

Agrawal, Akshaya, Mayer, Parker, Kingston, Zachary, Hollinger, Geoffrey A.

Constrained Nonlinear Kaczmarz Projection on Intersections of Manifolds for Coordinated Multi-Robot Mobile Manipulation

arXiv.org Artificial IntelligenceOct-28-2024

Cooperative manipulation tasks impose various structure-, task-, and robot-specific constraints on mobile manipulators. However, current methods struggle to model and solve these myriad constraints simultaneously. We propose a twofold solution: first, we model constraints as a family of manifolds amenable to simultaneous solving. Second, we introduce the constrained nonlinear Kaczmarz (cNKZ) projection technique to produce constraint-satisfying solutions. Experiments show that cNKZ dramatically outperforms baseline approaches, which cannot find solutions at all. We integrate cNKZ with a sampling-based motion planning algorithm to generate complex, coordinated motions for 3 to 6 mobile manipulators (18--36 DoF), with cNKZ solving up to 80 nonlinear constraints simultaneously and achieving up to a 92% success rate in cluttered environments. We also demonstrate our approach on hardware using three Turtlebot3 Waffle Pi robots with OpenMANIPULATOR-X arms.

artificial intelligence, constraint-based reasoning, manifold, (15 more...)

2410.2163

Country: North America > United States > Oregon (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Robots > Robot Planning & Action (0.89)
Information Technology > Artificial Intelligence > Representation & Reasoning > Constraint-Based Reasoning (0.66)

Barlaud, Michel, Perez, Guillaume, Marmorat, Jean-Paul

A new Linear Time Bi-level $\ell_{1,\infty}$ projection ; Application to the sparsification of auto-encoders neural networks

arXiv.org Artificial IntelligenceJul-23-2024

The $\ell_{1,\infty}$ norm is an efficient-structured projection, but the complexity of the best algorithm is, unfortunately, $\mathcal{O}\big(n m \log(n m)\big)$ for a matrix $n\times m$.\\ In this paper, we propose a new bi-level projection method, for which we show that the time complexity for the $\ell_{1,\infty}$ norm is only $\mathcal{O}\big(n m \big)$ for a matrix $n\times m$. Moreover, we provide a new $\ell_{1,\infty}$ identity with mathematical proof and experimental validation. Experiments show that our bi-level $\ell_{1,\infty}$ projection is $2.5$ times faster than the actual fastest algorithm and provides the best sparsity while keeping the same accuracy in classification applications.

algorithm, neural network, projection, (17 more...)

2407.16293

Country:

North America > United States > New York (0.04)
North America > Canada > Ontario > Toronto (0.04)
Europe > Greece (0.04)
(2 more...)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Perez, Guillaume, Barlaud, Michel

Multi-level projection with exponential parallel speedup; Application to sparse auto-encoders neural networks

arXiv.org Artificial IntelligenceJul-4-2024

The $\ell_{1,\infty}$ norm is an efficient structured projection but the complexity of the best algorithm is unfortunately $\mathcal{O}\big(n m \log(n m)\big)$ for a matrix in $\mathbb{R}^{n\times m}$. In this paper, we propose a new bi-level projection method for which we show that the time complexity for the $\ell_{1,\infty}$ norm is only $\mathcal{O}\big(n m \big)$ for a matrix in $\mathbb{R}^{n\times m}$, and $\mathcal{O}\big(n + m \big)$ with full parallel power. We generalize our method to tensors and we propose a new multi-level projection, having an induced decomposition that yields a linear parallel speedup up to an exponential speedup factor, resulting in a time complexity lower-bounded by the sum of the dimensions, instead of the product of the dimensions. we provide a large base of implementation of our framework for bi-level and tri-level (matrices and tensors) for various norms and provides also the parallel implementation. Experiments show that our projection is $2$ times faster than the actual fastest Euclidean algorithms while providing same accuracy and better sparsity in neural networks applications.

algorithm, complexity, projection, (15 more...)

2405.02086

Country:

North America > United States > New York (0.04)
North America > Canada > Ontario > Toronto (0.04)
Europe > France > Provence-Alpes-Côte d'Azur (0.04)

Genre: Research Report (0.64)

Industry: Health & Medicine > Therapeutic Area > Oncology (0.93)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)

Neural Information Processing SystemsMar-13-2024, 11:16:33 GMT

On the Convergence Rate of Decomposable Submodular Function Minimization

algorithm, convergence, submodular function, (14 more...)

Country:

North America > United States > California > Alameda County > Berkeley (0.14)
Asia > Middle East > Jordan (0.04)
North America > United States > Illinois (0.04)
(3 more...)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.68)

Neural Information Processing SystemsMar-13-2024, 07:00:14 GMT

Projecting Markov Random Field Parameters for Fast Mixing

algorithm, piecewise kl, projection, (15 more...)

Country: Oceania > Australia > New South Wales (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.71)

arXiv.org Artificial IntelligenceAug-18-2023

A Lightweight Transformer for Faster and Robust EBSD Data Collection

Dong, Harry, Donegan, Sean, Shah, Megna, Chi, Yuejie

Three dimensional electron back-scattered diffraction (EBSD) microscopy is a critical tool in many applications in materials science, yet its data quality can fluctuate greatly during the arduous collection process, particularly via serial-sectioning. Fortunately, 3D EBSD data is inherently sequential, opening up the opportunity to use transformers, state-of-the-art deep learning architectures that have made breakthroughs in a plethora of domains, for data processing and recovery. To be more robust to errors and accelerate this 3D EBSD data collection, we introduce a two step method that recovers missing slices in an 3D EBSD volume, using an efficient transformer model and a projection algorithm to process the transformer's outputs. Overcoming the computational and practical hurdles of deep learning with scarce high dimensional data, we train this model using only synthetic 3D EBSD data with self-supervision and obtain superior recovery accuracy on real 3D EBSD data, compared to existing methods.

artificial intelligence, machine learning, natural language, (17 more...)

2308.09693

Country: North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)

Genre: Research Report (0.82)

Industry:

Government (0.46)
Information Technology (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)